Protein families and TRIBES in genome sequence space
نویسندگان
چکیده
منابع مشابه
Protein families and TRIBES in genome sequence space.
Accurate detection of protein families allows assignment of protein function and the analysis of functional diversity in complete genomes. Recently, we presented a novel algorithm called TribeMCL for the detection of protein families that is both accurate and efficient. This method allows family analysis to be carried out on a very large scale. Using TribeMCL, we have generated a resource calle...
متن کاملSYSTERS, GeneNest, SpliceNest: exploring sequence space from genome to protein
We have integrated the protein families from SYSTERS and the expressed sequence tag (EST) clusters from our database GeneNest with SpliceNest, a new database mapping EST contigs into genomic DNA. The SYSTERS protein sequence cluster set provides an automatically generated classification of all sequences of the SWISS-PROT, TrEMBL and PIR databases into disjoint protein family and superfamily clu...
متن کاملProtein families in the metazoan genome.
The evolution of development involves the development of new proteins. Estimates based on the initial results of the genome projects, and on the data banks of protein sequences and structures, suggest that the large majority of proteins come from no more than one thousand families. Members of a family are descended from a common ancestor. Protein families evolve by gene duplication and mutation...
متن کاملPercolation in protein sequence space
The currently known protein sequences are not distributed equally in sequence space, but cluster into families. Analyzing the cluster size distribution gives a glimpse of the large and unknown extant protein sequence space, which has been explored during evolution. For six protein superfamilies with different fold and function, the cluster size distributions followed a power law with slopes bet...
متن کاملCooperative "folding transition" in the sequence space facilitates function-driven evolution of protein families.
In the protein sequence space, natural proteins form clusters of families which are characterized by their unique native folds whereas the great majority of random polypeptides are neither clustered nor foldable to unique structures. Since a given polypeptide can be either foldable or unfoldable, a kind of "folding transition" is expected at the boundary of a protein family in the sequence spac...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Nucleic Acids Research
سال: 2003
ISSN: 1362-4962
DOI: 10.1093/nar/gkg495